parquet 0.3.0

Apache Parquet implementation in Rust
Documentation

parquet-rs

Build Status Coverage Status License Documentation Master API docs

An Apache Parquet implementation in Rust (work in progress)

Usage

Add this to your Cargo.toml:

[dependencies]
parquet = "0.3"

and this to your crate root:

extern crate parquet;

Example usage:

use std::fs::File;
use std::path::Path;
use parquet::file::reader::{FileReader, SerializedFileReader};

let file = File::open(&Path::new("/path/to/file")).unwrap();
let reader = SerializedFileReader::new(file).unwrap();
let mut iter = reader.get_row_iter(None).unwrap();
while let Some(record) = iter.next() {
  println!("{}", record);
}

See crate documentation on available API.

Supported Parquet Version

  • Parquet-format 2.4.0

To update Parquet format to a newer version, check if parquet-format version is available. Then simply update version of parquet-format crate in Cargo.toml.

Requirements

  • Rust nightly

See Working with nightly Rust to install nightly toolchain and set it as default.

Build

Run cargo build or cargo build --release to build in release mode.

Test

Run cargo test for unit tests.

Binaries

The following binaries are provided (use cargo install to install them):

  • parquet-schema for printing Parquet file schema and metadata. Usage: parquet-schema <file-path> [verbose], where file-path is the path to a Parquet file, and optional verbose is the boolean flag that allows to print full metadata or schema only (when not specified only schema will be printed).

  • parquet-read for reading records from a Parquet file. Usage: parquet-read <file-path> [num-records], where file-path is the path to a Parquet file, and num-records is the number of records to read from a file (when not specified all records will be printed).

Benchmarks

Run cargo bench for benchmarks.

Docs

To build documentation, run cargo doc --no-deps. To compile and view in the browser, run cargo doc --no-deps --open.

License

Licensed under the Apache License, Version 2.0: http://www.apache.org/licenses/LICENSE-2.0.